Abstract

Background: Saccharomyces cerevisiae is able to adapt to a wide range of external oxygen conditions. Previously, 
oxygen-dependent phenotypes have been studied individually at the transcriptional, metabolite, and flux level. 
However, the regulation of cell phenotype occurs across the different levels of cell function. Integrative analysis of 
data from multiple levels of cell function in the context of a network of several known biochemical interaction types 
could enable identification of active regulatory paths not limited to a single level of cell function. 

Results: The graph theoretical method called Enriched Molecular Path detection (EMPath) was extended to enable 
integrative utilization of transcription and flux data. The utility of the method was demonstrated by detecting paths 
associated with phenotype differences of S. cerevisiae under three different conditions of oxygen provision: 20.9%,
2.8% and 0.5%. The detection of molecular paths was performed in an integrated genome-scale metabolic and 
protein-protein interaction network. 

Conclusions: The molecular paths associated with the phenotype differences of S. cerevisiae under conditions of
different oxygen provisions revealed paths of molecular interactions that could potentially mediate information 
transfer between processes that respond to the particular oxygen availabilities. 

Keywords: Saccharomyces cerevisiae, Network biology, Molecular path finding, Data integration, Oxygen, Constraint- 
based modeling 



Background 

The transcriptome is a realization of the genome of an 
organism whereas the fluxes are an ultimate response of 
the complete multilevel regulatory system of a cell. The 
correlation between the transcriptome and the fluxes is 
usually weak [1] since a substantial part of the regulation 
of cell physiology occurs at the post-transcriptional and 
metabolic levels [2]. The regulation is mediated by interactions
beyond individual levels of cell function. Active
paths of regulatory interactions which determine the cell 
phenotype are concealed in data on cell components be- 
longing to different regulatory levels. Integration of these 
data into frameworks of known interactions of multiple
types could allow for a reconstruction of the regulatory 
paths associated with specific phenotypes. Genome-scale 
metabolic models build on the entity of metabolic en- 
zyme encoding genes in the genome. These models are 
already available for various organisms and provide 
frameworks of metabolic interactions to the extent of 
whole cells. Metabolic network context is being utilized 
to identify transcriptionally differentially regulated pre- 
defined pathways of enzymes sharing metabolites as sub- 
strates and products by parametric gene set enrichment 
analysis [3]. Full interconnectivity of metabolism is being 
applied in the identification of reporter metabolites, 
regulatory hot spots around which the most significant 
transcriptional changes have occurred [4]. Protein-protein 
interactions facilitate various kinds of information transfer, 
e.g. a change in the localization or activity of a protein as a
result of physical interaction or post-translational modifica- 
tion [5-7]. In particular, protein kinases serve as key regula- 
tors of nutrient sensing and signaling via protein-protein 
interactions. A network of interactions of key protein ki- 
nases of nutrient dependent regulation has been mapped, 
manually curated and annotated for the eukaryotic model 
organism S. cerevisiae [8]. A global network of protein kin- 
ase and phosphatase interactions that mediate information 
transfer via post-translational modifications is also available 
for S. cerevisiae [9] along with a large-scale data set on 
various types of physical protein-protein interactions [10]. 

Other types of biochemical interactions, such as
signaling and transcription factor interactions, also allow
for communication between cellular components [11,12].

Previously, a graph-theoretical method called Enriched 
Molecular Path detection (EMPath) was developed in 
order to identify molecular interaction paths from multi- 
level interactome data [13]. The EMPath method was an 
extension of a "color coding" algorithm [14] which had 
earlier been used to detect signaling cascades based on 
edge reliabilities in protein-protein interaction networks 
[15] and more general structures, such as trees [16]. The 
developed EMPath method was applied to detect pheno- 
type specific molecular paths in type 1 diabetes mouse 
models in an integrated network of metabolic, protein- 
protein and signal transduction interactions scored with 
transcription data [13]. Recently, several graph theoret- 
ical methods for detection of molecular paths in an 
interaction network context have been developed. Gene 
Graph Enrichment Analysis (GGEA) integrates a known 
gene regulatory network in an analysis of transcription 
data and gains interpretability of the regulation processes 
underlying the gene expression response [17]. FiDePa 
(Finding Deregulated Paths) [18] and Topology Enrich- 
ment Analysis frameworK (TEAK) [19] find differentially 
expressed pathways between two cell phenotypes in sig- 
naling or regulatory networks and metabolic pathways, 
respectively. A method called Clipper exploits network 
topology to detect signaling paths within longer pathways 
based on differential gene expression between two pheno- 
types [20]. However, all these methods employ a single 
type of phenotypic information (i.e. transcription data), 
whereas post-transcriptional regulation has a recognized 
and substantial effect on a phenotype. Therefore, the EM- 
Path method was extended in this study to enable inte- 
grative simultaneous utilization of two data types, i.e. 
transcription and flux data in the context of a multi-level 
interaction network to detect enriched molecular paths as- 
sociated with phenotypic differences. 

Oxygen is a major determinant of physiology for the 
eukaryotic model organism S. cerevisiae. S. cerevisiae is 
able to remodel its energy generation and redox metab- 
olism according to the availability of oxygen in such a 



flexible way that it can grow under a wide range of oxygen 
availabilities from fully aerobic conditions to anaerobiosis. 
Characterization of the oxygen-dependent phenotypes of 
S. cerevisiae has previously been reported at the individual 
transcriptional, metabolite, and flux levels [21-23]. In this 
study, two case-control settings of the oxygen dependent 
phenotype differences of S. cerevisiae were defined. The 
phenotype under conditions of 20.9% O2 provision was
compared to the phenotype under conditions of 2.8% O2
provision, and the phenotype under conditions of 2.8% O2
provision was compared to the phenotype under conditions
of 0.5% O2 provision. Previously, it was noted that
S. cerevisiae had highly similar flux distributions under
conditions of 20.9% and 2.8% O2 provision [23], but
interestingly there were substantial differences in the
transcriptomes [21]. The phenotypes of S. cerevisiae possessed
substantially different flux distributions under conditions
of 2.8% and 0.5% O2 provision [23], whereas the
transcriptomes of the phenotypes were surprisingly similar [21].
Thus, transcription and flux data were integratively uti- 
lized to find enriched molecular interaction paths associ- 
ated with the aforementioned differences in the previously 
observed oxygen-dependent phenotypes [21-23]. The path 
detection was performed in a combined network of meta- 
bolic [24-26] and protein-protein interactions (Search 
Tool for the Retrieval of Interacting Genes database 
(STRING): [27]) of S. cerevisiae. 

Methods 

Overview 

Figure 1 illustrates the overall pipeline of the study. 
First, a genome-scale metabolic network model and the 
protein-protein interactions including the global kinase- 
phosphatase interactions [9] were integrated into a sin- 
gle interaction network. Then, flux and transcription 
data were assigned to node weights to set the network 
into a phenotypic context. Then, the EMPath method 
was used to detect enriched up- and down-regulated 
molecular interaction paths within the network. In the 
end, the paths were visualized as integrated networks and 
enriched with previously known functional categories. 

Network representation 

The integrated network of metabolic and protein-protein 
interactions comprised a recently refined version [24] of
the yeast whole genome metabolic model, protein-protein 
interactions from the STRING database [27], and a kinase- 
phosphatase interaction network [9]. From the STRING 
database the protein interactions with an experimental 
score greater than 900 were included, thus excluding inter- 
actions with low experimental evidence. The integrated 
network representation is illustrated in Figure 1. In this 
representation the metabolic reactions of the genome- 
scale model [24] are nodes and there is an edge between 

Figure 1 Overall workflow of the study, comprising the following main steps. • A genome-scale metabolic network model and protein-protein
interactions, including kinase-phosphatase interactions, were integrated into a single network representation. • Phenotypic context from fluxome
and transcriptome data was incorporated into the network. • EMPath was used for detecting up- and down-regulated paths. • The detected paths were
visualized and enriched with previously known functional categories.



two reactions if they share a metabolite, i.e. have either a
common substrate or product. Cofactors and other metabolites
not participating in the metabolic conversions with
their carbon backbone were excluded from the network.
The excluded metabolites are listed in Additional file 1. All
edges were modeled as undirected. Each reaction
comprised a set of gene(s) that encodes an enzyme that 
catalyzes the reaction. Protein-protein interactions were 
integrated with nodes representing enzymatic reactions if 
the metabolic enzymes had reported protein-protein inter- 
actions. In total, the whole integrated network comprised 
5 702 nodes and 41 525 edges. 
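
The network construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `build_reaction_graph`, the reaction and metabolite identifiers, and the excluded-metabolite set are hypothetical toy inputs chosen only to show how reactions sharing a non-excluded metabolite become connected and how protein-protein interactions add further undirected edges.

```python
from itertools import combinations

def build_reaction_graph(reactions, excluded, ppi_edges=()):
    """Return a set of undirected edges (frozensets of two node names).

    reactions: dict mapping reaction id -> set of metabolite ids
               (substrates and products together).
    excluded:  metabolites ignored when linking reactions (e.g. cofactors).
    ppi_edges: extra undirected edges, e.g. protein-protein interactions.
    """
    excluded = set(excluded)
    edges = set()
    # Two reactions are linked if they share at least one retained metabolite.
    for r1, r2 in combinations(sorted(reactions), 2):
        if (reactions[r1] & reactions[r2]) - excluded:
            edges.add(frozenset((r1, r2)))
    # Protein-protein interactions are added as further undirected edges.
    for a, b in ppi_edges:
        edges.add(frozenset((a, b)))
    return edges

# Toy example with hypothetical identifiers.
toy_reactions = {
    "R_icl": {"isocitrate", "succinate", "glyoxylate"},
    "R_sdh": {"succinate", "fumarate", "ubiquinone"},
    "R_adh": {"acetaldehyde", "ethanol", "NADH"},
    "R_ald": {"acetaldehyde", "acetate", "NADH"},
}
toy_edges = build_reaction_graph(
    toy_reactions,
    excluded={"NADH", "ubiquinone"},
    ppi_edges=[("R_sdh", "P_x")],
)
```

In this toy case the cofactor NADH does not create an edge by itself, which mirrors the exclusion of cofactors from the network.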

Transcription and flux data 

Wiebe et al. (2008) grew S. cerevisiae in glucose-limited 
chemostat cultivations at a dilution rate of 0.1 h⁻¹ under
different oxygenation conditions (i.e. 20.9%, 2.8%, 1.0%
and 0.5% O2) in the chemostat inlet gas to obtain the
oxygen dependent phenotypes [22]. Rintala et al. (2009) 
performed the analysis of the transcriptomes of S. cerevi- 
siae under the different conditions of oxygen provision 
[21]. The normalized transcription dataset was stored in 
the Gene Expression Omnibus (GEO) database [28] with 



the accession number GSE12442. In the present study, 
all four replicates of transcription data from each of the 
steady state cultures with 20.9%, 2.8%, and 0.5% O2 in
the chemostat inlet gas were used to determine the tran- 
scription scores for the detection of molecular paths. 

Genome-scale flux distributions were sampled from the 
solution space of a genome-scale metabolic model of S. 
cerevisiae by Monte Carlo sampling using the Artificial Centering
Hit-and-Run (ACHR) sampler [29]. Prior to the sampling, the
genome-scale metabolic model of S. cerevisiae was im- 
proved by further refinement of its oxygen dependent me- 
tabolism [24] (Additional file 1). The model was also 
constrained with P/O ratios dependent on a specific oxy- 
gen uptake rate (OUR) [23] and experimental data re- 
ported on extracellular fluxes, i.e., growth rate, substrate 
consumption rates and product secretion rates [22]. The 
Carbon Evolution Rate (CER), resulting from carbon diox- 
ide production at various sites in metabolism, was allowed 
to vary freely to introduce flexibility to the system since 
the remaining secretion rates were set to zero. However, 
the introduction of the exact experimental rate con- 
straints resulted in an infeasible solution space. Thus, 
the lower and upper bound constraints derived from 



Lindfors et al. BMC Systems Biology 2014, 8:16
http://www.biomedcentral.com/1752-0509/8/16






the extracellular growth, glucose uptake, and ethanol 
secretion rates were simultaneously and gradually ex- 
panded until a feasible flux solution existed. At each 
step the constraints were expanded with 10% of the par- 
ticular SEMs (Standard Error of the Mean) of the extra- 
cellular rates [22] (see Additional file 1 for the final 
constraints). OUR and P/O ratio constraints were kept as 
strict constraints since the oxygen uptake rates followed 
from the provision of oxygen in the chemostat inlet gas, 
which was the only experimental parameter changed in 
the bioreactor cultivations resulting in the three different 
phenotypes of S. cerevisiae [22] that were investigated in 
this study. Further, P/O ratios of S. cerevisiae dependent 
on OUR were previously determined [23] and used here. 
The Monte Carlo sampling of flux distributions was per- 
formed with the ACHR sampler [29] implemented in the 
COBRA Toolbox [30]. A threshold for the reactions with 
non-zero fluxes was set to a minimum of 10⁻⁷ mmol/(g
CDW h). Zero fluxes were assigned to the rest of the reac- 
tions. A total of 10 000 feasible points were collected in 
the solution space out of which 2 000 samples were ran- 
domly selected for the calculation of mean fluxes. The 
mean values of unconstrained CER in the flux distribution 
samples differed by 4% to 13% from the experimental
values. 
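
The post-processing of the sampled flux distributions can be sketched as follows. This is a hedged illustration in Python rather than the COBRA Toolbox workflow used in the study: the function name `mean_fluxes` is hypothetical, and it is an assumption here that the 10⁻⁷ threshold is applied to the absolute value of the mean flux (the text does not state whether thresholding was done per sample or on the mean).

```python
import random

def mean_fluxes(samples, n_select, threshold=1e-7, seed=0):
    """Average a random subset of sampled flux vectors.

    samples:   list of flux vectors (one list of floats per sample).
    n_select:  number of samples drawn without replacement
               (2 000 out of 10 000 in the study).
    threshold: mean fluxes with absolute value below this are set to zero
               (assumed interpretation of the non-zero flux threshold).
    """
    rng = random.Random(seed)
    chosen = rng.sample(samples, n_select)
    n_reactions = len(chosen[0])
    means = []
    for j in range(n_reactions):
        m = sum(sample[j] for sample in chosen) / n_select
        means.append(m if abs(m) >= threshold else 0.0)
    return means

# Toy example: two samples over three reactions.
toy_samples = [[1.0, 2e-9, -0.5], [3.0, 4e-9, -1.5]]
toy_means = mean_fluxes(toy_samples, n_select=2)
```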

Combining network and phenotypic data 

Previously, only transcription data was used as phenotypic 
data in the detection of enriched molecular paths [13]. 
Here the EMPath method was extended for integrative 
utilization of transcription and flux data having separate 
weights, w(trans) and w(flux), respectively. More specifically,
w(trans) is defined in Formula (1), in which
trans-intensity(case) and trans-intensity(control) are the case
and control intensities of mRNA expression level, respectively,
averaged over all replicates. In the genome-scale metabolic
model of S. cerevisiae the gene regulatory rules are
expressed by AND and OR operands for the metabolic reactions
that have more than one encoding gene (e.g. a multi-protein
complex as catalyst) [25]. If there was an OR operand
between two genes, a mean intensity was calculated,
and if there was an AND operand, a minimum
intensity was taken. Since there is no transcriptome
data for non-enzymatic reactions (i.e. they do not require
a catalyzing enzyme or an encoding gene to occur), neutral
weights (i.e. zero) were assigned to them.
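
The handling of gene rules described above can be sketched in Python. The nested-tuple rule encoding and the function name `rule_intensity` are assumptions made for illustration only; the genome-scale model stores its rules in its own format.

```python
def rule_intensity(rule, intensity):
    """Evaluate an expression intensity for a reaction's gene rule.

    rule:      either a gene name (str) or a tuple ("and" | "or", sub-rules...).
    intensity: dict mapping gene name -> mean expression intensity.
    AND takes the minimum of its operands, OR takes their mean,
    following the description in the text.
    """
    if isinstance(rule, str):
        return intensity[rule]
    op, *operands = rule
    values = [rule_intensity(r, intensity) for r in operands]
    if op == "and":
        return min(values)
    if op == "or":
        return sum(values) / len(values)
    raise ValueError(f"unknown operand: {op!r}")

# Toy example with hypothetical gene names: (g1 AND g2) OR g3.
toy_intensity = {"g1": 8.0, "g2": 2.0, "g3": 4.0}
toy_rule = ("or", ("and", "g1", "g2"), "g3")
toy_value = rule_intensity(toy_rule, toy_intensity)
```

Here the AND branch contributes min(8, 2) = 2, and the OR operand then averages 2 and 4.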



$$ w(\mathrm{trans}) = \log_2 \frac{\text{trans-intensity(case)}}{\text{trans-intensity(control)}} \qquad (1) $$

The weight derived from the flux data for each reaction,
w(flux), is defined in Formula (2), in which flux(case)
and flux(control) were obtained by calculating averages
over the 2 000 randomly selected samples, each
corresponding to a feasible flux distribution (see
Transcription and flux data above).

$$ w(\mathrm{flux}) = \log_2 \frac{\text{flux(case)}}{\text{flux(control)}} \qquad (2) $$



The total score for the node is defined in Formula (3).
When the two data types were used simultaneously, w(trans)
and w(flux) were scaled to be in the same interval,
which was essential to prevent either of them from
being over-represented in the detected molecular paths.
In practice, the flux data was scaled to have the same
range as the transcription data: [-2.71, 4.75] for 2.8% vs.
0.5% oxygen in the bioreactor inlet gas and [-3.31, 4.97]
for 20.9% vs. 2.8% in the bioreactor inlet gas. Flux data
was naturally not available for signaling proteins (i.e.
non-metabolic proteins), thus their scores were calculated
solely from the transcription data.



$$ w(\mathrm{tot}) = a \cdot w(\mathrm{flux}) + (1 - a) \cdot w(\mathrm{trans}), \quad a \in \{0, 0.5, 1\} \qquad (3) $$



The motivation for using parameter a was to allow
relative weighting of the flux and transcription data in
the detection of molecular paths, e.g. weighting with pure
transcription data (a = 0), pure flux data (a = 1), or their
simultaneous utilization with equal weight (a = 0.5).
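
The node scoring can be sketched in Python as follows. The log2 ratios and the weighted sum follow the formulas in the text. The rescaling step is an assumption: the text states only that the flux weights were scaled to the range of the transcription weights, so a linear min-max mapping is used here purely for illustration. All function names are hypothetical, and inputs are assumed to be strictly positive so that the logarithm is defined.

```python
import math

def log2_ratio(case, control):
    """Log2 case/control ratio; inputs are assumed strictly positive."""
    return math.log2(case / control)

def rescale(values, lo, hi):
    """Linearly map values onto [lo, hi] (assumed scaling scheme)."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        raise ValueError("cannot rescale constant values")
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

def total_weight(w_flux, w_trans, a):
    """Weighted sum of flux and transcription weights, a in {0, 0.5, 1}."""
    return a * w_flux + (1 - a) * w_trans

# Toy example: three reactions with made-up intensities and fluxes.
w_trans = [log2_ratio(8.0, 2.0), log2_ratio(1.0, 4.0), log2_ratio(4.0, 4.0)]
w_flux_raw = [log2_ratio(4.0, 1.0), log2_ratio(1.0, 1.0), log2_ratio(1.0, 4.0)]
w_flux = rescale(w_flux_raw, min(w_trans), max(w_trans))
w_tot = [total_weight(f, t, a=0.5) for f, t in zip(w_flux, w_trans)]
```

One caveat of a linear min-max mapping is that it does not in general keep a weight of zero at zero, so the actual scaling used in the study may differ.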

Molecular path detection 

After the weights were assigned to the nodes, the EMPath 
method [13] was used to detect an optimal path of length 
k. The algorithm is initialized by assigning colors, i.e. ran- 
dom integer numbers [1, k], to the nodes of the path. Then 
a node with a maximum weight score is added to be the 
first node in the path. Then the neighboring nodes to the 
recently added node are considered to be the next node in 
the path. From this set a node with a maximum weight 
score is added to the path but nodes with a color that is 
already included in the path are ignored. Nodes are added 
until there are k nodes in the path. Then a score of the 
path is calculated by summing up all the node weights. 
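
The greedy procedure described in this paragraph can be sketched in Python. This is an illustration of the steps as worded here, not the authors' C++ implementation; the function name, the adjacency-list representation, the optional explicit coloring, and the choice to return `None` when no admissible neighbor remains are all assumptions.

```python
import random

def greedy_colored_path(adj, weight, k, color=None, seed=0):
    """Grow a path of k nodes with pairwise distinct colors.

    adj:    dict mapping node -> list of neighboring nodes.
    weight: dict mapping node -> node weight.
    color:  optional dict node -> integer in [1, k]; drawn at random if omitted.
    Returns (path, score), or None if the path cannot reach k nodes.
    """
    if color is None:
        rng = random.Random(seed)
        color = {v: rng.randint(1, k) for v in adj}
    # Start from the node with the maximum weight.
    current = max(adj, key=lambda v: weight[v])
    path, used = [current], {color[current]}
    while len(path) < k:
        # Neighbors whose color is already on the path are ignored.
        candidates = [v for v in adj[current] if color[v] not in used]
        if not candidates:
            return None
        current = max(candidates, key=lambda v: weight[v])
        path.append(current)
        used.add(color[current])
    return path, sum(weight[v] for v in path)

# Toy example with a fixed coloring.
toy_adj = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
toy_weight = {"A": 3.0, "B": 1.0, "C": 2.0, "D": 5.0}
toy_color = {"A": 2, "B": 3, "C": 2, "D": 1}
toy_path, toy_score = greedy_colored_path(toy_adj, toy_weight, k=3, color=toy_color)
```

In the toy case node A has a higher weight than B but shares its color with C, so B is chosen as the third node. Because colors on a path are distinct, no node can be visited twice.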

In order to calculate the p-value for the null hypothesis 
(i.e. that the detected path is obtained by chance), a ran- 
dom distribution was created by shuffling the node weights 
1 000 times. After each shuffle, a path was detected and its 
score was calculated as described above. In this way, 1 000 
optimal path scores in a random network were obtained 
resulting in a random distribution. A p-value for the null 
hypothesis that the detected path is obtained by chance 
was defined by comparing its score to the random distribution.
A cut-off p-value of 0.025 was used, i.e. paths with higher
p-values were not considered significant. The network was
considered fully harvested of optimal paths when there were
i consecutive iterations in which the detected path had
already been detected during previous iterations.
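
The permutation test can be sketched in Python as follows. The text does not give the exact formula for the p-value, so this sketch assumes it is the fraction of shuffled-network path scores that are at least as high as the observed score. The function name `empirical_p_value` and the `best_score` callback (standing in for a full path detection run) are hypothetical.

```python
import random

def empirical_p_value(observed, weight, best_score, n_shuffles=1000, seed=0):
    """Estimate a p-value by shuffling node weights.

    observed:   score of the path detected with the real weights.
    weight:     dict mapping node -> weight.
    best_score: callable taking a weight dict and returning the score
                of the path detected under those weights.
    Returns the fraction of shuffles whose score is >= observed
    (assumed definition of the p-value).
    """
    rng = random.Random(seed)
    nodes = list(weight)
    values = [weight[v] for v in nodes]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(values)  # reassign the same weights to random nodes
        shuffled = dict(zip(nodes, values))
        if best_score(shuffled) >= observed:
            hits += 1
    return hits / n_shuffles

# Toy example: the "path score" is simply the weight landing on node "A".
toy_weights = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
toy_p = empirical_p_value(4.0, toy_weights, lambda w: w["A"])
significant = toy_p <= 0.025  # cut-off used in the study
```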

The path detection was performed separately for up- and
down-regulated paths in both case-control comparisons
(20.9% vs. 2.8%, and 2.8% vs. 0.5% O2 in the bioreactor
inlet gas), and for each value of parameter a ∈ {0, 0.5, 1}.
When the up-regulated paths were detected, case-control
ratios were used, and when the down-regulated paths were
detected, control-case ratios were used. Eight (8) was used
as the path length k. There is no rigorous way to define
the proper value for this parameter; eight was empirically
found to be suitable: smaller values (e.g. 7) led to overly
sparse combined networks of enriched molecular paths, and
higher values (e.g. 9) led to very dense combined networks
of enriched molecular paths, which would have had poor
interpretability. In a similar vein, ten (10) was selected for
parameter i on an empirical basis: higher values did not
harvest the network significantly more thoroughly. The path
detection calculations were implemented in C++ and were
run on an Ubuntu Linux server with two Intel Xeon X5650
2.66 GHz processors, providing 24 virtual cores, and 70 GB
of RAM.

Enrichment of functional protein categories 

In order to study how pre-established cellular functions 
were associated with the detected molecular paths, the 
combined networks were associated with functional protein
categories from FunCat [31] using a hypergeometric
test with a false discovery rate (FDR) [32] q-value
cut-off of 0.05, as described in [13]. Open reading
frame (ORF) identifiers were used to identify the genes.
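
The enrichment test can be sketched in Python using only the standard library. The hypergeometric upper-tail probability follows directly from its definition. The multiple-testing step is an assumption: the Benjamini-Hochberg procedure is shown as one common way to control the FDR, and the text does not confirm that this exact procedure was used. Function names and the toy numbers are hypothetical.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N items from M, of which n are annotated."""
    upper = min(n, N)
    total = comb(M, N)
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return tail / total

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy example: 10 genes in total, 3 in a category, 4 in the combined
# network, and all 3 category genes found in the network.
toy_p = hypergeom_sf(3, M=10, n=3, N=4)
toy_q = benjamini_hochberg([0.01, 0.04, 0.03])
enriched = [q <= 0.05 for q in toy_q]
```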

Path length 

The method requires the selection of a pre-defined path
length, which is a heuristic choice and deserves some discussion.
Let us assume that the network comprises n nodes and,
for simplicity, that they are fully connected to
each other. In this case the network comprises $\binom{n}{k}$ paths
of length k, in which n ≫ k. The greater the length k,
the more paths the network comprises. Thus, too small a
path length would lead to information-poor networks. On
the other hand, a drawback of a long path length is that
the computational enumeration becomes heavy and the
crowded combined networks become difficult to interpret.
Eight was selected as the path length since it is the shortest
length that provides paths which reasonably combine both
metabolic and protein-protein interactions in all the studied cases.

Results and discussion 

Effect of relative weighting of transcription and flux data 
on the detected molecular paths 

The detected molecular interaction paths combined 
protein-protein interactions and metabolic interactions 



depending on the phenotypes compared and the relative
weighting used to combine the transcription and flux data. 
The numbers of protein-protein interactions (PPI) and 
metabolic edges in the combined networks of the detected 
molecular paths for each of the phenotype comparisons 
are shown in Table 1. Metabolic edges prevailed when a = 
1 (i.e. only flux data used) in all comparisons except "2.8% 
vs. 0.5%, down" where there were as many PPI edges as 
metabolic edges. When the metabolic edges prevailed the 
detected paths generally followed the metabolic routes in 
which the fluxes had changed substantially. The neighbor- 
ing metabolic reactions had correlated flux weights as the 
result of the steady state flux data being constrained by 
metabolic network stoichiometry. There were two com- 
parisons ("2.8% vs. 0.5%, down" and "20.9% vs. 2.8%, 
down") in which PPI edges prevailed when a = 0 (i.e. only 
transcription data used) indicating that in these com- 
parisons metabolic pathways were less coherently tran- 
scriptionally down-regulated than the paths following 
protein-protein interactions. 

Peroxisomal activities and oxidative stress response 
featured in the upregulated interaction paths of 
phenotype differences between the fully respirative 
phenotype of S. cerevisiae and the respirofermentative 
phenotype at 2.8% oxygenation 

Wiebe et al. (2008) had previously observed that the metabolism
of S. cerevisiae was fully respirative under conditions
of 20.9% O2 in the bioreactor inlet gas, whereas
under conditions of 2.8% O2 in the bioreactor inlet gas the
metabolic state was respirofermentative [22]. However, the
drop in the specific Oxygen Uptake Rate (OUR) was small,
from 2.7 ± 0.04 to 2.5 ± 0.04 mmol/(g CDW h) [22], and
Jouhten et al. (2008) observed that the flux distributions
remained almost constant except for the subtle flux to
ethanol production [23]. Nevertheless, major changes between
the two phenotypes have been observed at the
transcriptional level [21]. The transcription and flux data for S.
cerevisiae during steady state growth conditions at 20.9% 
and 2.8% oxygen provision were analyzed here in an inte- 
grative manner and separately with the EMPath method 
to detect molecular interaction paths that were possible 
determinants of the phenotypic differences observed in S. 
cerevisiae growing under the two different oxygenation 
conditions. When transcription data on S. cerevisiae growing
under fully aerobic conditions and under conditions of
2.8% O2 in the bioreactor inlet gas was solely used in the
scoring of the up-regulated nodes in the detection of mo- 
lecular interaction paths, cellular processes of respirative 
metabolism, fatty acid oxidation, and oxidative stress 
defense were represented in the paths (Figure 2, FunCat 
enrichments in Additional file 1). Glyoxylate pathway en- 
zyme isocitrate lyase encoded by ICL1 and a dicarboxylate 
carrier transporting succinate from glyoxylate cycle into 

Table 1 Size information of detected paths and combined network in each comparison

Comparison (percentage of oxygen in the bioreactor inlet gas)

Column 1: The number of PPI edges.
Column 2: The number of metabolic edges.
Column 3: The fraction of PPI edges from all edges.
Column 4: The fraction of metabolic edges from all edges.
Column 5: The number of nodes in the combined network.
Column 6: The number of detected paths.



mitochondria to be incorporated into TCA cycle encoded 
by DIC1 [33] appeared in the molecular paths up- 
regulated at the level of gene expression. The glyoxylate 
cycle is known to be induced in S. cerevisiae under respira- 
tive conditions for the metabolism of non-fermentative 
carbon sources [34]. In addition, the methylisocitrate lyase 
reaction catalyzed by an enzyme encoded by ICL2, which 



is homologous to ICL1, was also included in the detected 
molecular paths. Isocitrate dehydrogenase encoding IDP2 
was connected via isocitrate to isocitrate lyase of the glyox- 
ylate cycle. The IDP2 encoded isoform is an alternative 
source of cytosolic NADPH, for the pentose phosphate 
pathway, but only while the metabolic state is respira- 
tive [35]. Succinate interconnected the glyoxylate cycle 

Figure 2 Detected up-regulated molecular paths combined into one network, 20.9% vs 2.8%, only transcription data used.






components further to SHH3 (YMR118C) (fold change 
5.0) encoding a putative mitochondrial inner membrane 
protein [36]. SHH3 was linked via a protein-protein 
interaction to ubiquinone-6 dependent succinate de- 
hydrogenase. Succinate dehydrogenase was expectedly 
the only respiratory chain coupled component observed 
since most of the respiratory chain components in S. 
cerevisiae are expressed on a lower level under fully 
aerobic conditions than in conditions of lower oxygen 
provision [21]. In addition to the respirative metabol- 
ism, fatty acid beta oxidation was observed in the de- 
tected molecular paths. Beta oxidation of fatty acids 
occurs in peroxisomes in yeast and provides an alterna- 
tive energy source for S. cerevisiae under aerobic condi- 
tions. Accordingly, PEX14, which is involved in the 
import of peroxisomal proteins [37], had protein- 
protein interactions with the components of fatty acid 
beta oxidation in the detected paths. Both peroxisome
biogenesis and fatty acid beta oxidation are regulated
by the SNF1p kinase, a coordinator of energy metabolism
in S. cerevisiae [38]. The transcriptional regulation of
peroxisome biogenesis and fatty acid beta oxidation
also involves the common regulators ADR1p, OAF1p, and
PIP2p. Rintala et al. (2009) showed that the genes involved
in fatty acid beta oxidation and peroxisomal biogenesis
were expressed at higher levels under the fully aerobic
conditions than in conditions of any lower oxygen
provision [21]. In the detected molecular interaction paths
PEX14 was further linked to regulators of protein folding
(HSP42, SIS1, SSA3), in particular in response to stress,
which share a YAP1p binding site [YEASTRACT database,
July 16, 2013; [39-41]]. YAP1p is a transcription factor
responsive to oxidative stress. In the detected molecular
paths fatty acid beta oxidation was connected to oxidative
stress defense via CTA1, which encodes a catalase
required for the removal of hydrogen peroxide, a strong
oxidant, in the peroxisomal matrix. Hydrogen peroxide is
formed as a byproduct in the beta oxidation of fatty acids.
CTA1p was further linked to a cytosolic catalase reaction
involved in the defense against oxidative damage encoded
by CTT1 (fold change 4.6) and a hydrogen peroxide
reductase reaction that mediates the maintenance of cellular
redox balance. Koerkamp et al. (2002) observed that an
induction of peroxisomal fatty acid oxidation triggers a
transient YAP1p mediated oxidative stress response [42].
However, the transient oxidative stress response did not
induce the expression of CTT1, and CTA1 co-responded
non-transiently with other genes involved in the
peroxisomal functions. Here, the up-regulation of the defense
against oxidative agents linked to the up-regulation of
peroxisomal activities via molecular interaction paths in S.
cerevisiae cells provided with air compared to cells
provided with 2.8% oxygen in the chemostat inlet gas
suggests that S. cerevisiae co-regulates these activities. The



peroxisomal activities and oxidative stress defense could 
be down-regulated either directly in response to the de- 
creased oxygen availability though it did not result in 
substantially lowered oxygen uptake rate (2.7 mmol/(g 
CDW h) vs 2.5 mmol/(g CDW h) under provision of 
20.9% vs 2.8% oxygen, respectively [22]), or in response 
to the induced fermentative metabolism in cells pro- 
vided with 2.8% oxygen in the chemostat inlet gas. 

Acetyl-CoA synthesis and shuttling were interconnected
to the CTT1 encoded catalase and defense against oxidative
agents via protein-protein interactions and a guanine
nucleotide exchange factor MUK1p, which is involved in
protein trafficking [43]. MUK1p had a protein-protein
interaction with carnitine o-acetyltransferase of the carnitine
shuttle, which is active both in peroxisomes and in
mitochondria. The carnitine shuttle transfers acetyl-CoA
across peroxisomal and mitochondrial membranes. CAT2
encodes the carnitine o-acetyltransferase in S. cerevisiae
and was coupled to an acetyl-CoA synthetase isoform
encoded by ACS1, which is induced under respirative
metabolism in S. cerevisiae [44]. ACS1 was down-regulated
when 2.8% O2 was provided compared to fully aerobic
conditions, even though the metabolism of S. cerevisiae
was mainly respirative. The localization of the ACS1
encoded acetyl-CoA synthetase was unclear
until recently, when Chen et al. (2012) confirmed at least a
distributed localization of the ACS1 encoded enzyme between
cytosol and peroxisomes [45]. However, ACS1p has
also been observed in the mitochondrial proteome [46].
Perhaps the down-regulation of ACS1 in response to the
subtle decrease in the oxygen uptake rate under conditions
of 2.8% O2 provision was related to a general
down-regulation of the peroxisomal activities. Remarkably, the
decreased oxygen provision which resulted in a mild de- 
crease in the respiratory activity [21-23] triggered the 
down-regulation of peroxisomal functions coupled to the 
fatty acid beta oxidation whereas a respiratory deficiency 
in an absence of oxygen limitation has been observed to 
trigger an opposite response, an up-regulation of peroxi- 
somal activities [47]. 

When both transcription and flux data were used to 
score the nodes of the network in the EMPath method, 
the molecular paths up-regulated in the fully respirative 
phenotype of S. cerevisiae compared to the respirofermen- 
tative phenotype observed under 2.8% oxygenation [22] in- 
cluded key enzymes of respirative metabolism i.e. pyruvate 
dehydrogenase, the gate keeper of the TCA cycle, and cit- 
rate synthase (Figure 3, FunCat enrichments in Additional 
file 1). They were linked to the ACS1 encoded acetyl-CoA
synthetase, which was observed in the enriched molecular
paths when the path detection was run solely with the
transcription data. Further connections were observed
to the mitochondrial NAD+ dependent and cytosolic
NADP+ dependent isoforms of acetaldehyde dehydrogenase
encoded by ALD4 and ALD6, respectively [48,49]. Both
the ALD4 encoded isoform and the ALD6 encoded iso- 
form, which is an additional source of cytosolic NADPH, 
had lower mRNA and protein levels under oxygen limita- 
tion than under fully aerobic conditions [21]. The mRNA 
and protein levels of ALD4 and ALD6 encoded acetalde- 
hyde dehydrogenase isoenzymes correlated within five dif- 
ferent conditions of oxygen provision from fully aerobic to 
anaerobic. Here flux estimation also suggested changes in 
the fluxes of the reactions catalysed by both isoforms. The 
succinate dehydrogenase reaction, which is closely coupled 
to the respiratory chain, showed an altered flux response 
between the compared conditions and was observed in the 
detected paths when only the transcription data was used 
in scoring. However, the glyoxylate cycle components and 
components involved in the peroxisomal fatty acid beta 
oxidation were absent from the molecular paths when the 
flux data was included in the scoring. The glyoxylate 
cycle is under glucose repression [34] and no in vivo ac- 
tivity of the glyoxylate cycle in S. cerevisiae was pre- 
viously observed in the 13 C-labelling experiments on 
glucose either under fully aerobic conditions or in 2.8% 
oxygenation [23]. 



Scoring the nodes of the interaction network solely with flux
data resulted in molecular interaction paths dominated by
components of sphingolipid metabolism and protein-protein
interactions between them (Additional file 2: Figure S1; FunCat
enrichments in Additional file 1). Expression of the SUR2 and
SCS7 encoded hydroxylases involved in the biosynthesis of
sphingolipids has been found to be oxygen-dependent [50,51].
Thus, the oxygen uptake rate (OUR) may have had an effect on the
in vivo activity of the sphingolipid biosynthesis pathway.
Sphingolipid metabolism has been associated with ageing and
apoptosis [52], both of which were observed in the FunCat
enrichments of the detected molecular paths.

Downregulated interaction paths of the phenotype
differences between the fully respirative phenotype of S.
cerevisiae and the respirofermentative phenotype at 2.8%
oxygenation involved regulation of the cell cycle at the
transcriptional level

Components of fermentative metabolism, alcohol dehydrogenases
in particular, were present in the down-regulated molecular
paths in the fully respirative phenotype of S. cerevisiae
compared to the respirofermentative phenotype of S. cerevisiae
under the 2.8% oxygenation conditions when both transcription
and flux data were incorporated into the scores (Figure 4, both
transcription and flux data used in the scoring; Additional file
2: Figure S2, scoring with pure flux data; FunCat enrichments in
Additional file 1). When only transcription data was used in the
scoring, a separate, interconnected network of regulatory
components was observed (Figure 5). The regulatory components
were involved in the mating pathway and in the regulation of the
cell cycle (FunCat enrichments in Additional file 1). The
separate regulatory network was linked via protein-protein
interactions to IMP dehydrogenase and, thus, to nucleotide
synthesis.

Figure 3 Detected up-regulated molecular paths combined into one network, 20.9% vs 2.8%, both transcription and flux data used.

Notably, alcohol dehydrogenase was found in the detected
molecular paths only when flux data was included in the scoring,
even though alcohol production was a major phenotypic difference
between S. cerevisiae under fully aerobic conditions and under
2.8% oxygen provision. This emphasizes the benefit of
integrating data from a post-transcriptional regulatory level
into the analysis.
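
The comparison between scoring schemes described above amounts to a set difference over the nodes of the detected paths. The following minimal sketch shows that bookkeeping under stated assumptions: the function name `nodes_requiring_flux` is hypothetical, and the node sets are illustrative placeholders rather than the paths actually detected in the study.

```python
def nodes_requiring_flux(transcription_only, flux_only, combined):
    """Return nodes detected with flux-informed scoring (flux-only or
    combined) that are absent when only transcription data is used."""
    return (set(flux_only) | set(combined)) - set(transcription_only)


# Illustrative node sets only. "REG_A" and "REG_B" are made-up placeholders
# standing in for regulatory components; they are not names from the study.
transcription_only = {"REG_A", "REG_B", "IMP dehydrogenase"}
flux_only = {"alcohol dehydrogenase"}
combined = {"alcohol dehydrogenase"}

flux_dependent = nodes_requiring_flux(transcription_only, flux_only, combined)
print(sorted(flux_dependent))
```

Nodes returned by such a comparison are candidates for regulation that is not visible at the transcriptional level, which is the reading given to alcohol dehydrogenase above.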



Upregulated molecular interaction paths detected in S. 
cerevisiae between the respirofermentative phenotypes at 
2.8% oxygenation and 0.5% oxygenation suggest 
remodelling of transport across the mitochondrial 
membrane 

The metabolic state of S. cerevisiae was respirofermentative
under both conditions, 2.8% and 0.5% O2 in the bioreactor inlet
gas [22], and the transcriptomes of S. cerevisiae were observed
to be similar under these two conditions [21]. However, the flux
distributions were substantially different [23]. Under the 0.5%
oxygenation conditions the yield of ethanol on glucose exceeded
the yield of biomass on glucose, and pyruvate decarboxylase
carried the main flux from the pyruvate branching point, in
contrast to the subtle ethanol production of S. cerevisiae under
2.8% oxygenation conditions [23]. When the transcription data
alone was used to score the nodes, the detected molecular paths
up-regulated in S. cerevisiae under the 2.8% oxygenation
conditions compared to the 0.5% oxygenation conditions featured
a remodelling of transport between the cytosol and mitochondria,
and of respirative metabolism (Figure 6; FunCat enrichments in
Additional file 1). The remodelling of respirative metabolism at
the transcriptional level was progressive as a function of
oxygenation, since the glyoxylate cycle components, the ACS1
encoded acetyl-CoA synthetase, and the isocitrate dehydrogenase
encoded by IDP2 were also observed in the molecular paths
representing the differences in the response of S. cerevisiae to
fully aerobic conditions and conditions of 2.8% oxygen
provision. The glyoxylate cycle was represented in the molecular
paths detected for the differences of S. cerevisiae phenotypes
between 2.8% and 0.5% oxygenation conditions by both malate
synthase encoded by MLS1 and isocitrate lyase. In addition,
components of the propionate catabolic pathway, which resembles
the glyoxylate cycle, including a 2-methylcitrate synthase
encoded by CIT3, aconitase encoded by PDH1, and methylisocitrate
lyase encoded by ICL2, were observed in the paths.
Methylisocitrate lyase cleaves methylisocitrate into succinate
and pyruvate, which enter the TCA cycle. Propionate catabolism
is generally under glucose repression [53], but PDH1 has also
been observed to be regulated by retrograde regulators and
induced in mitochondrial dysfunction [47]. However, here, during
decreased respiratory activity due to a limited availability of
oxygen, PDH1 was down-regulated. Interestingly, a number of
transport reactions between the cytosolic and mitochondrial
compartments were observed in the detected molecular paths. The
transporters were carriers of the intermediates of the TCA
cycle, and of acetate and CoA. The proton gradient across the
mitochondrial membrane affects molecule and ion transport since
many of the transporters are proton symporters or antiporters.
The appearance of the transporters in the up-regulated molecular
paths suggests that in 0.5% oxygenation conditions the low
availability of oxygen may have limited the generation of the
proton gradient across the mitochondrial membrane by the
electron transfer chain of S. cerevisiae and, thus, the
transport required reorganization.

Figure 4 Detected down-regulated molecular paths combined into one network, 20.9% vs 2.8%, both transcription and flux data used.

When both transcription and flux data were used in the scoring
of nodes up-regulated in S. cerevisiae under the 2.8%
oxygenation conditions compared to the 0.5% oxygenation,
additional components involved in aerobic metabolism, such as
fructose-1,6-bisphosphatase, a gluconeogenic enzyme encoded by
FBP1, and the pyruvate dehydrogenase complex, were observed
among others (Figure 7; FunCat enrichments in Additional file
1). Again, the glyoxylate cycle components were absent when flux
data was included in the scoring whereas the components involved
in propionate metabolism were observed.

Mevalonate biosynthesis prevailed in the detected up-regulated
molecular paths when only flux data was used to score the nodes
(Additional file 2: Figure S3; FunCat enrichments in Additional
file 1). In addition, acetaldehyde dehydrogenase isoforms
encoded by ALD4 and ALD5 catalyzing the mitochondrial NADP+
specific and cytosolic NAD+ specific reactions were observed.
Most of the metabolic interactions in the detected paths
involved either acetyl-CoA or CoA.

Figure 6 Detected up-regulated molecular paths combined into one network, 2.8% vs 0.5%, only transcription data used.

Potential post-transcriptionally co-regulated reactions 
found in the downregulated molecular interaction paths 
detected in S. cerevisiae between the respirofermentative 
phenotypes at 2.8% oxygenation and 0.5% oxygenation 

When both flux and transcription data were used in the scoring
of nodes down-regulated in S. cerevisiae under the 2.8%
oxygenation compared to the 0.5% oxygenation, key enzymes of
central carbon metabolism, namely glucose-6-phosphate isomerase,
fructose bisphosphate aldolase, phosphoglycerate kinase,
pyruvate decarboxylase, and alcohol dehydrogenase, were observed
in the detected molecular paths (Figure 8). These enzymes,
involved in the glycolytic pathway, pyruvate metabolism, and the
fermentative pathway (FunCat enrichments in Additional file 1),
are not directly linked by metabolic interactions, but were
connected by protein-protein interactions in the detected
molecular paths. Collins et al. (2007) reported in their
high-throughput study the protein-protein interactions between
glucose 6-phosphate isomerase (PGI1p), fructose bisphosphate
aldolase (FBA1p), 3-phosphoglycerate kinase (PGK1p), pyruvate
decarboxylase (PDC1p), and alcohol dehydrogenase (ADH1p) [54].
The genes encoding the discussed enzymes, i.e. FBA1, PGK1, PDC1,
and ADH1, have all been observed to have stable expression under
a range of conditions [55]. However, the fluxes of the glucose
6-phosphate isomerase, fructose bisphosphate aldolase,
3-phosphoglycerate kinase, pyruvate decarboxylase, and alcohol
dehydrogenase reactions were substantially lower under 2.8%
oxygenation conditions than under the even lower oxygen
availability of 0.5% oxygenation [23], whereas the corresponding
transcript levels, as expected, did not show consistent behavior
[21]. On the other hand, the level of FBA1p is under
post-transcriptional control by the 14-3-3 proteins BMH1p and
BMH2p [56]. In fact, post-transcriptional regulation was
previously observed to have a major effect on the protein levels
in S. cerevisiae under the conditions of 0.5% O2 in the
bioreactor inlet gas [21]. If the physical interactions between
these enzymes mediate a transfer of information in some form,
they could enable coordinated regulation of central carbon
metabolism in upper and lower glycolysis and in the fermentative
pathway. The information transfer could occur, for example, via
a common post-translational modification occurring while the
proteins interact. Notably, all these enzymes contain identified
phosphorylation sites (www.phosphopep.org) [57], and a
differential phosphorylation of one of the enzymes, fructose
bisphosphate aldolase (FBA1p), in response to a switch in growth
conditions was recently observed by Oliveira et al. (2012) [58].
Protein-protein interactions further connected the enzymes of
central carbon metabolism to fatty acid import and biosynthesis.
The detected molecular interaction paths included FAS1 and FAS2,
which are involved in the elongation of saturated fatty acids,
and FAA1 and FAA4, which encode enzymes catalyzing the import
and activation of unsaturated fatty acids available in the
growth medium. The detected down-regulated molecular paths were
highly similar, involving the components of central carbon
metabolism, when pure flux data was used in the scoring (Figure
9). If flux data was not incorporated into the scoring, only
amino acid transport was observed (Additional file 2: Figure S4;
FunCat enrichments in Additional file 1). This observation
emphasized the value of the integrative analysis of
transcription and flux data, which reflect the states of
different functional levels of cells.

Figure 7 Detected up-regulated molecular paths combined into one network, 2.8% vs 0.5%, both transcription and flux data used.
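
The effect of the choice of data on node scores can be illustrated with a toy calculation. The sketch below is not the EMPath weighting scheme, which is not reproduced in this section. It assumes a hypothetical rule, the mean of the absolute changes over the selected data types, and uses made-up values for an enzyme whose transcript level is stable while its flux changes. The function name `node_score` and its parameters are likewise assumptions.

```python
def node_score(transcript_change, flux_change,
               use_transcription=True, use_flux=True):
    """Mean absolute change over the selected data types.

    Hypothetical combination rule for illustration only; it is not the
    weighting scheme used by EMPath.
    """
    parts = []
    if use_transcription:
        parts.append(abs(transcript_change))
    if use_flux:
        parts.append(abs(flux_change))
    if not parts:
        raise ValueError("select at least one data type")
    return sum(parts) / len(parts)


# Made-up values: stable expression, substantially lower flux.
stable_transcript, changed_flux = 0.0, -2.0

scores = {
    "transcription only": node_score(stable_transcript, changed_flux, use_flux=False),
    "flux only": node_score(stable_transcript, changed_flux, use_transcription=False),
    "combined": node_score(stable_transcript, changed_flux),
}
print(scores)
```

Under this assumed rule a node with stable expression scores zero with transcription data alone and can only enter a high-scoring path once flux data contributes, which mirrors the behaviour reported above for the glycolytic and fermentative enzymes.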

Conclusions 

In this study, the EMPath method for the detection of molecular
interaction paths [13] was extended to allow for simultaneous
utilization of transcriptome and fluxome data in an integrative
manner. The method was applied to a combined network of S.
cerevisiae's metabolic and protein-protein interactions. In
contrast to existing path finding methods [13,17-20,59], data
from two sources were combined into one weighting scheme. This
enabled the identification of molecular paths that potentially
transfer information beyond a single functional level of cells.
The molecular paths coupled cellular components and processes
that are distant at first sight but are associated through
different biochemical interactions with the oxygen-dependent
phenotype changes in S. cerevisiae. New light was shed on the S.
cerevisiae phenotypes previously investigated separately at the
level of transcription and at the level of in vivo fluxes
[21-23]. However, while the combined weighting scheme was of
profound interest, all three weighting schemes resulted in
enriched molecular paths providing complementary insight into
the oxygen-dependent phenotypes of S. cerevisiae. In addition,
the dominance of post-transcriptional regulation in certain
processes, i.e. glycolytic and fermentative fluxes, was
emphasized by the differences observed in the enriched molecular
paths detected with the different weighting schemes. In
particular, the detected molecular paths highlighted
protein-protein interactions between the enzymes of central
carbon metabolism that could possibly mediate coordinated
post-transcriptional regulation of the differential in vivo
activity of central metabolism in S. cerevisiae in two different
respirofermentative metabolic states. Further, the
down-regulation of oxidative stress in S. cerevisiae in
conditions of 2.8% oxygenation compared to fully aerobic
conditions was found to be related, and potentially restricted,
to the down-regulation of peroxisomal activities. The results
further suggested that a limited availability of oxygen and the
consequently decreased respirative activity may affect transport
reactions of S. cerevisiae across the mitochondrial membrane
under conditions of 0.5% oxygen provision. Finally, the paths
included metabolic interactions via metabolic intermediates at
the crossroads of altered processes, such as acetyl-CoA and
succinate, whose concentrations could be potential phenotypic
markers.

Figure 8 Detected down-regulated molecular paths combined into one network, 2.8% vs 0.5%, both transcription and flux data used.